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g\nec\l 1 96\1 39 1 9.spec\l 39 1 9.ts 
LAMBERTIAN REFLECTANCE AND LINEAR SUBSPACES 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

The present invention relates generally to computer vision and, more particularly, 
to image recognition and model reconstructions systems. 

2. Prior Art 

One of the most basic problems in vision is to understand how variability in 
lighting affects the images that an object can produce. Even when lights are isotropic and 
relatively far from an object, it has been shown that smooth Lambertian objects can produce 
infinite-dimensional sets of images. 

It has been very popular in object recognition to represent the set of images that an object can 
produce using low dimensional linear subspaces of the space of all images. There are those in 
the art who have analytically derived such a representation for sets of 3D points undergoing 
scaled orthographic projection. Still others have derived a 3D linear representation of the set of 
images produced by a Lambertian object as lighting changes, though this simplified 
representation assigns negative intensities in places where the surface normals are facing away 
from the light. Others have used factorization to build 3D models using this linear 
representation. Yet still others have extended this to a 4D space by allowing for a diffuse 
component to lighting. These analytically derived representations have been restricted to fairly 
simple settings; for more complex sources of variation researchers have collected large sets of 
images and performed Principal Component Analysis (PCA) to build representations that capture 
within class variations and variations in pose and lighting. PCA is a numerical technique that 
finds the linear subspace that best represents a data set. Given a large set of images, PCA finds 
the low-dimensional linear subspace that fits them most closely. Experiments have been 
performed by those in the art that show that large numbers of images of real objects, taken with 
varied lighting conditions, do lie near a low-dimensional linear space, justifying this 



representation. More recently, non-linear representations have been used which point out that 
when lighting is restricted to be positive, an object's images occupy a convex volume. A. 
Georghiades et al., "Illumination Cones for Recognition Under Variable Lighting: Faces", 
CVPR 98: 52-59, 1998 and A. Georghiades et al, "From Few to Many: Generative Models for 
5 Recognition Under Variable Pose and Illumination", Int. Conf. on Automatic Face and Gesture 
Recognition 2000, 2000 (collectively referred to as "Georghides") uses this representation for 
object recognition. 

Spherical harmonics has been used in the graphics literature to efficiently 
represent the bi-directional reflection function (BRDF) of different materials. It has been 
10 proposed to replace the spherical harmonics basis with a different basis that is more suitable for a 
half sphere. M. D r Zmoura, 1991. "Shading Ambiguity: Reflectance and Illumination," in 
Computational Models of Visual Processing, edited by M. Landy, and J. Movshon (hereinafter 
p "D'Zmoura") pointed out that the process of turning incoming light into reflection can be 
^ ji described in terms of spherical harmonics. With this representation, after truncating high order 
P 15 components, the reflection process can be written as a linear transformation, and so the low order 
Iqi components of the lighting can be recovered by inverting the linear transformation. D'Zmoura 
™ used this analysis to explore ambiguities in lighting. The present invention extends the work of 
n D'Zmoura by deriving subspace results for the reflectance function, providing analytic 

I s : descriptions of the basis images, and constructing new recognition algorithms that use this 
j =l20 analysis while enforcing non-negative lighting. Georghiades and D'Zmoura are incorporated 
p herein by their reference. 

In view of the prior art, there is a need for a computer vision system which shows 
how to analytically find low dimensional linear subspaces that accurately approximate the set of 
images that an object can produce from which portions of these subspaces can be carved out 
25 corresponding to positive lighting conditions. These descriptions can then be used for both 
recognition and model-building. 

SUMMARY OF THE INVENTION 
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Therefore it is an object of the present invention to provide a method for choosing 
an image from a plurality of three-dimensional models which is most similar to an input image 
which overcomes the disadvantages of the prior art methods. 

It is a further object of the present invention to provide a method for choosing an 
5 image from a plurality of three-dimensional models which is most similar to an input image 
which can be performed more efficiently and more rapidly than the methods of the prior art. 

It is yet a further object of the present invention to provide a method for choosing 
an image from a plurality of three-dimensional models which is most similar to an input image 
which can be performed without complex iterative optimization techniques. 

1 0 Variations in lighting can have a significant impact on the appearance of an 

^ object. The present invention provides a novel characterization of this variability for the case of 
Q Lambertian objects. A Lambertian object is one having a surface which reflects light according 
*i to Lambert's law, see J. Lambert. "Photometria Sive de Mensura et Gradibus Luminus, Colorum 
JjJ et Umbrae", Eberhard Klett, 1760. First, lighting is represented using spherical harmonics, the 
□ 1 5 effects of Lambertian materials is described as the analog of a convolution; this is similar to 
' 8 working in the frequency domain in signal processing. It is then possible to show that almost all 
J* the appearance of Lambertian objects is determined by the first nine components of the lighting 
p when represented as spherical harmonics. All reflectance functions (the mapping from surface 
j:| normal to intensity) produced by Lambertian objects can be proved to lie close to a 9D linear 
p20 subspace, explaining prior empirical results. The present invention also provides a simple 

analytic characterization of the linear space of images an object can produce. This can be readily 
used in object recognition algorithms based on linear methods, or that enforce non-negative 
lighting functions using convex optimization. In the case where a 4D linear approximation of an 
object's images is sufficient, the present invention shows that non-negative lighting can be 
25 enforced very simply. 

The present invention analyzes the set of reflectance functions produced under 
Lambert's model for arbitrary configurations of lights. It is shown that such reflectance functions 
are produced through the analog of a convolution of the light with a kernel that acts essentially as 
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a low pass filter. The present invention uses this and the non-negativity of light to prove 
analytically that under common lighting conditions, a nine-dimensional linear subspace, for 
example, accounts for 99.2% of the variability in the reflectance function. In the worst case this 
9D space accounts for 98% of the variability. This suggests that in general the set of images of a 
convex, Lambertian object can be approximated accurately by a low dimensional linear space. 
The present invention further shows how to analytically derive this subspace for an object model. 
This result sheds new light on existing recognition algorithms and leads to a number of new, 
efficient algorithms for recognition and model construction under varying light and pose. 

Accordingly, a method for choosing an image from a plurality of three- 
dimensional models which is most similar to an input image is provided. The method comprises 
the steps of: (a) providing a database of the plurality of three-dimensional models; (b) providing 
an input image; (c) positioning each three-dimensional model relative to the input image; (d) for 
each three-dimensional model, determining a rendered image that is most similar to the input 
image by: (d)(i) computing a linear subspace that describes an approximation to the set of all 
possible rendered images that each three-dimensional model can produce under all possible 
lighting conditions where each point in the linear subspace represents a possible image; and one 
of (d)(ii) finding the point on the linear subspace that is closest to the input image or finding a 
rendered image in a subset of the linear subspace obtained by projecting the set of images that 
are generated by positive lights onto the linear subspace; (e) computing a measure of similarly 
between the input image and each rendered image; and (f) selecting the three-dimensional model 
corresponding to the rendered image whose measure of similarity is most similar to the input 
image. Step (d) is preferably repeated for each of a red, green, and blue color component for 
each three-dimensional model. The linear subspace is preferably either four-dimensional or 
nine-dimensional. 

BRIEF DESCRIPTION OF THE DRAWINGS 

These and other features, aspects, and advantages of the apparatus and methods of 
the present invention will become better understood with regard to the following description, 
appended claims, and accompanying drawings where: 

Figure 1 illustrates a graph representation of the coefficients of Equation (9). 
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Figure 2 illustrates a ID slice of the Lambertian kernel and its various 
approximations. 

Figure 3 illustrates the first nine harmonic images derived from a 3D model of a 

face. 

5 Figure 4 illustrates test images used in the experiments of the methods of the 

present invention. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 

When light is isotropic and distant from an object, it can be characterized by 
describing intensity as a function of direction. The set of all possible lighting conditions, then, is 

10 equivalent to the set of all functions that are everywhere positive on the surface of a sphere. The 
approach of the present invention begins by adopting a representation of these functions using 
surface spherical harmonics. This is analogous to Fourier analysis, but on the surface of the 
sphere. Spherical harmonics describe functions contained in the unit sphere, surface harmonics 
are restrictions of these functions to the sphere's surface. To model the way surfaces turn light 

1 5 into an image the present invention looks at reflectance as a function of the surface normal 
(assuming unit albedo). The present invention shows that reflectance functions are produced 
through the analog of a convolution of the lighting function with a kernel that represents 
Lambert's reflection. D'Zmoura used such an analysis to detect ambiguities in the appearance of 
objects due to lighting variations. A bit of complexity is added to this simple view to describe 

20 what happens with objects made of non-constant materials, and to handle non-Lambertian 
reflectance functions. 

With this view, it can be shown that a Lambertian kernel is a low-pass filter, and 
that this filter can be described analytically. Therefore, it can be shown analytically that for 
many common lighting conditions, much of the variation in appearance of an object depends on 
25 the first four components of the harmonic transform of the lighting function and almost all the 
variation is accounted for by the first nine components. In fact, the quality of the approximation 
deteriorates very little even when the light includes significant high frequency patterns. Lower 
bounds on the quality of the approximation under any lighting function are derived. This 
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provides an understanding from first principles of the empirical observation that most images of 
an object lie near a low-dimensional linear subspace. Moreover, this linear subspace can be 
derived analytically from a model, while past efforts relied on performing PCA on a large set of 
rendered images. 

5 This analytic understanding of how to linearly approximate lighting variations 

forms the core of a number of results. First, it allows better evaluation of the utility of several 
existing recognition and model construction methods. For example, it can be shown that the 
linear subspace method of the prior art in fact are based on using a linear space spanned by the 
three first order harmonics, but that it omits the significant DC component. Secondly, it leads to 
10 new methods of recognizing objects with unknown pose and lighting conditions. In particular, 
an algorithm for recognition under varying pose and illumination is presented that works in an 
analytically derived low-dimensional space. Finally, for cases in which a 4D linear subspace 
provides an adequate approximation, recognition can be performed very efficiently, without 
complex, iterative optimization techniques. 

1 5 MODELING IMAGE FORMATION 

Consider a convex object illuminated by distant isotropic light sources. Assume 
further that the surface of the object reflects light according to Lambert's law. This relatively 
simple model has been analyzed and used effectively in a number of vision applications. This 
analysis can be extended to non-Lambertian objects. The set of images of a Lambertian object 

20 obtained with arbitrary light has been termed "the illumination cone" by some in the art. The 
objective of the present invention is to analyze properties of the illumination cone. For the 
analysis it will be useful to consider the set of reflectance functions obtained under different 
illumination conditions. A reflectance function (also called a reflectance map Horn) associated 
with a specific lighting configuration is defined as the light reflected by a sphere of unit albedo 

25 as a function of the surface normal. A reflectance function is related to an image of a convex 
object illuminated by the same lighting configuration by the following mapping. Every visible 
point on the object's surface inherits its intensity from the point on the sphere with the same 
normal, and this intensity is further scaled by the albedo at the point. The effect of this mapping 
is discussed below. 
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Image Formation as the Analog of a Convolution 

Let S denote a unit sphere centered at the origin. Let p = (x, y, z) denote a point on 
5 the surface of S 9 and let N p = (jc, y, z) denote the surface normal at p. p can also be expressed as a 
unit vector using the following notation: 



where 0<8<7t and 0 < 0 < 2 n. In this coordinate frame the poles are set at (0,0, ± 1), 9 denotes 
10 the solid angle between p and (0,0,1), and it varies with latitude, and 0 varies with longitude. 
Since it is assumed that the sphere is illuminated by a distant and isotropic set of lights all points 
on the sphere see these lights coming from the same directions, and they are illuminated by 
identical lighting conditions. Consequently, the configuration of lights that illuminate the sphere 
can be expressed as a non-negative function & (0, 0), expressing the intensity of the light reaching 
15 the sphere from each direction (<9, 0). Furthermore, according to Lambert's law the difference in 
the light reflected by the points is entirely due to the difference in their surface normals. Thus, 
the light reflected by the sphere can be expressed as a function r (9, 0) whose domain is the set of 
surface normals of the sphere. 

According to Lambert's law, if a light ray of intensity / reaches a surface point 
20 with albedo X forming an angle 9 with the surface normal at the point, then the intensity reflected 
by the point due to this light is given by 



Here it is assumed without loss of generality (WLOG) that X = 1 . If light reaches a point from a 
multitude of directions then the light reflected by the point would be the sum of (or in the 
25 continuous case the integral over) the contribution for each direction. Denote by k (9) = max 
(cos 9 , 0), then, for example, the intensity of the point (0,0,1) is given by: 




(1) 



//Lmax(cos#,0) 



(2) 
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r(0,0) = J * [ k(o)l(0,<f>)sin0d6d&. (3) 

Similarly, the intensity r (0, 0) reflected by a point p = (6 , 0) is obtained by centering about p 
and integrating its inner product with /over the sphere. Thus, the operation that produces r (<9, 0) 
is the analog of a convolution on the sphere. This is referred to as a convolution, and thus: 

5 r(8 9 0) = k*& (4) 

The kernel of this convolution, k 9 is the circularly symmetric, half-cosine function. The 
convolution is obtained by rotating k so that its center is aligned with the surface normal at p. 
This still leaves one degree of freedom in the rotation of the kernel undefined, but since k is 
rotationally symmetric this ambiguity disappears. 

1 0 Properties of the Convolution Kernel 

Just as the Fourier basis is convenient for examining the results of convolutions in 
the plane, similar tools exist for understanding the results of the analog of convolutions on the 
sphere. The surface spherical harmonics are a set of functions that form an orthonormal basis for 
the set of all functions on the surface of the sphere. These functions are denoted by h nm with n = 
15 0,1,2,... and -n<m<n: 



U^ P"/'^"-^ ^^. (5) 



where P nm are the associated Legendre functions, defined as 



Pjz) J~ Z > d + z 2 -l*. (6) 
nm V ; 2 n n\ dz Il+m V 7 



The kernel, £, and the lighting function, are expressed as harmonic series, that is, as linear 
20 combinations of the surface harmonics. This is done primarily so that advantage can be taken of 
the analog to the convolution theorem for surface harmonics. An immediate consequence of the 
Funk-Hecke theorem (see, e.g., H. Groemer, Geometric applications of Fourier series and 
spherical harmonics, Cambridge University Press.) is that "convolution" in the function domain 

is equivalent to multiplication in the harmonic domain. As discussed below, a representation of 
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k is derived as a harmonic series. This derivation is used to show that k is nearly a low-pass 
filter. Specifically, almost all of the energy of k resides in the first few harmonics. This will 
allow us to show that the possible reflectances of a sphere all lie near a low dimensional linear 
subspace of the space of all functions defined on the sphere. 

A representation of k as a harmonic series can then be derived. In short, since k is 
rotationally symmetric about the pole, under an appropriate choice of a coordinate frame its 
energy concentrates exclusively in the zonal harmonics (the harmonics with m = 0), while the 
coefficients of all the harmonics with m ^ 0 vanish. Thus, k can be expressed as: 

00 

* = £*Ao- (7) 

J7=0 

The Lambertian kernel is given by k (9) = max (cos 9, 0), where 9 denotes the solid angle 
between the light direction and the surface normal. The harmonic transform of k is defined as 

00 77 

n=0 m--n 

where the coefficients a nm are given by 

WLOG, the coordinate system on the sphere is set as follows. One of the poles is positioned at 
the center of k, 9 then represents the angle along a longitude and varies from 0 to x 9 and 0 
represents an angle along a latitude and varies from 0 to 2tt. In this coordinate system k is 
independent of 0 and is rotationally symmetric about the pole. Consequently, all its energy is 
split between the zonal harmonics (the harmonics with m = 0), and the coefficients for every m ± 
0 vanish. 

An explicit form for the coefficients is then determined. First, one can limit the 
integration to the positive portion of the cosine function by integrating over 9 only to tt/2, that is, 



k -=l* f ™s0h nm (O,0)sm(MGd<{>. 
Next, since only the m = 0 components do not vanish, denote k n = k n0 , then 



k n = J o f cos 0h nO (#)sin 0d6d</> = 2nf Q cos 0h nO (6>)sin 0d0. 



Now, 



An 



where (z) is the associated Legendre function of order n defined by 



P.(') = — — 



Substituting z = cos 0 one obtains 



Ar„=V(2« + l>rf o 1 zP„(z>fe. 



1 0 Now turning to computing the integral 



This integral is equal to 



ir 1 z ii( z >_i)" Jz . 



Integrating by parts yields 



15 



Vn\ 



The first term vanishes and one is left with 
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"«!! J ° dz 11 -' V ' 2 n n\ dz V n ° 



This formula vanishes for z = i and so one obtains 



1 d 



n-2 



Now, 



(- 2 -i)"=E(;)(->r."- 



k=0 



When the - 2 derivative is taken, all terms whose exponent is less than n - 2 disappear. 
Moreover, since the derivative at z = 0 is evaluated, all the terms whose exponent is larger than n 
- 2 vanish. Thus, only the term whose exponent is 2k = n - 2 survives. Denote the n - 2 
coefficient by b n .2 9 then, when n is odd b n . 2 =0, and when n is even 



K-2 = 



vf-'y 



(-1) 



In this case 



d 



^(- 2 -l)l.o^-2K- 2 =("-2)f''l(-l) 



dz 



and one obtains 



J o 2"«! 2"(f-l)(f + l) 



'(f-l)(f + l) 

15 The above derivation holds for n > 2. The special cases that n = 0 and « = 1 

should be handled separately. In the first case P 0 (z) = 1 and in the second case Pi(z). For n - 0 
the integral becomes 



Jzdz = — , 
o 2 
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and for n = 1 it becomes 



f z 2 dz-—. 
Jo 3 



Consequently, 



(-l) t+1 («-2)V(2n + l>m 



2"fe-l)(f + l) 



/i = 0 

n=l 

«> 2, eve/? 
n>2,odd 



After this tedious manipulation it is obtained that 



k. =\ 



4k 

2 



{-\)z +l (n-2)U(2n+\)x 



n = 0 
n = \ 

n > 2, even 
n > 2, odd 



(8) 



The first few coefficients, for example, are 

k 0 = & « 0.8862 £ 2 = ^ « 0.4954 
K =$ £ 4 =-#»^0.H08 

(£j = £5 = £7 = 0) A graph representation of the coefficients is shown in Figure 1 . Figure 1 
1 0 shows from left to right: a graph representation of the first 1 1 coefficients of the Lambertian 
kernel; the relative energy captured by each of the coefficients; the accumulated energy; and a 
zoom in to the accumulated energy. 



_ Jl3^ ^ 1 
128 

k fl = W « 0.0285. 



k 6 = ^-0.0499 



(9) 



256 
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The energy captured by every harmonic term is measured commonly by the 
square of its respective coefficient divided by the total squared energy of the transformed 
function. The total squared energy in the half cosine function is given by 

f \l ^W^^^t ™* 2 0*^0=2^ z2dz =Y' (10) 

5 Table 1 shows the relative energy captured by each of the first several coefficients. The top row 
of Table 1 shows the energy captured by the w'th zonal harmonic for the Lambertian kernel (0 < 
n < 8). The middle row of Table 1 shows the energy accumulated up to the order n. This energy 
represents the quality of the n'th order approximation of r(6, (J)) (measured in relative squared 
error). The bottom row shows a lower bound on the quality of this approximation due to the 

10 non-negativity of the light. The n - 3, 5, and 7 are omitted because they contribute no energy. 
The relative energies shown in Table 1 are given in percents. It can be seen that the kernel is 
dominated by the first three coefficients. Thus, a second order approximation already accounts 
for 99.22% of the energy. With this approximation the half cosine function can be written as: 

k{9) = meLx{cos9fi)*^h 0Q +^h l0 +fih 20 = ^ + jcos6^ + ^cos2^. (11) 

1 5 The quality of the approximation improves somewhat with the addition of the fourth order term 
(99.81%) and deteriorates to 87.5% when a first order approximation is used. Figure 2 shows a 
ID slice of the Lambertian kernel and its approximations of first, second, and third order, 
respectively, from left to right. 



TABLE 1 



n 


0 


1 


2 


4 


6 


8 


Energy 


37.5 


50 


11.72 


0.59 


0.12 


0.04 


Accumulated energy 


37.5 


87.5 


99.22 


99.81 


99.93 


99.97 


Lower bound 


37.5 


75 


97.96 


99.48 


99.80 


99.90 



20 

Linear Approximations of the Reflectance Function 
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The fact that the Lambertian kernel has most of its energy concentrated in the low 
order terms implies that the set of reflectance functions of a sphere of unit albedo can be well 
approximated by a low dimensional linear space. This space is spanned by a small set of what 
are called harmonic reflectances. The harmonic reflectance r nm (0, 0) denotes the reflectance of 
5 the sphere when it is illuminated by the harmonic "light" h nm . Note that harmonic lights 
generally are not positive everywhere, so they do not correspond to real, physical lighting 
conditions; they are abstractions. As is explained below every reflectance function r (#, 0) will 
be approximated to an excellent accuracy by a linear combination of a small number of harmonic 
reflectances. 

10 To evaluate the quality of the approximation consider first, as an example, 

lighting generated by a point source at the z direction (0 = 0 = 0). A point source is a delta 
function. The reflectance of a sphere illuminated by a point source is obtained by a convolution 

O 

s% of the delta function with the kernel, which results in the kernel itself. Due to the linearity of the 
.!? convolution, if the reflectance due to this point source is approximated by a linear combination 
|;1 15 of the first three zonal harmonics, r 00 , r }0 , and r 2 o, 99.22% of the energy is accounted for 

min 

!\ (ao»*i>«2) 

where k 9 the Lambertian kernel, is the reflectance of the sphere when it is illuminated by a point 
O source at the z direction. Similarly, first and fourth order approximations yield respectively 
87.5% and 99.8 1% accuracy. 

20 If the sphere is illuminated by a single point source in a direction other than the z 

direction the reflectance obtained would be identical to the kernel, but shifted in phase. Shifting 
the phase of a function distributes its energy between the harmonics of the same order n (varying 
m) 9 but the overall energy in each n is maintained. The quality of the approximation, therefore, 
remains the same, but now for an Mh order approximation is needed to use all the harmonics 

25 with n < N for all m. Recall that there are 2n + 1 harmonics in every order n. Consequently, a 
first order approximation requires four harmonics. A second order approximation adds five more 
harmonics yielding a 9D space. The third order harmonics are eliminated by the kernel, and so 
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\\k\\ 



= 0.9922, 



they do not need to be included. Finally, a fourth order approximation adds nine more harmonics 
yielding an 18D space. 

It has been seen that the energy captured by the first few coefficients k(l<i<N) 
directly indicates the accuracy of the approximation of the reflectance function when the light 
5 includes a single point source. Other light configurations may lead to different accuracy. Better 
approximations are obtained when the light includes enhanced diffuse components of low- 
frequency. Worse approximations are anticipated if the light includes mainly high frequency 
patterns. 

It turns out, however, that even if the light includes mostly high frequency 
10 patterns the accuracy of the approximation is still very high. This is a consequence of the non- 
negativity of light. A lower bound on the accuracy of the approximation for any light function 
can be derived as follows. It is simple to show that for any non-negative function the amplitude 
of the DC component must be at least as high as the amplitude of any of the other components. 
One way to see this is by representing such a function as a non-negative sum of delta functions. 
1 5 In such a sum the amplitude of the DC component is the weighted sum of the amplitudes of all 
the DC components of the different delta functions. The amplitude of any other frequency may 
at most reach the same level, but often will be lower due to interference. Consequently, in an 
N"th order approximation the worst scenario is obtained when the amplitudes in all frequencies 
higher than N saturate to the same amplitude as the DC component, while the amplitude of 
20 orders 1 < n < N are set to zero. In this case the relative squared energy becomes 



Table 1 shows the bound obtained for several different approximations. It can be seen that using 
a second order approximation (involving nine harmonics) the accuracy of the approximation for 
any light function exceeds 97.96%. With a fourth order approximation (involving 18 harmonics) 
25 the accuracy exceeds 99.48%%. Note that the bound computed in Equation 13 is not tight, since 
the case that all the higher order terms are saturated may (and in fact in general will) yield a 
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function with some negative values. Consequently, the worst case accuracy may even be higher 
than the bound. 

Generating Harmonic Reflectances 

Constructing a basis to the space that approximates the reflectance functions is 
5 straightforward and can be done analytically. To construct the basis the Funk-Hecke theorem is 
invoked. Recall that this space is spanned by the harmonic reflectances, i.e., the reflectances 
obtained when a unit albedo sphere is illuminated by harmonic lights. These reflectances are the 
result of convolving the half cosine kernel with single harmonics. Due to the orthonormality of 
the spherical harmonics such a convolution cannot produce energy in any of the other harmonics, 
1 0 Consequently, denote the harmonic light by h nm , then the reflectance due to this harmonic is the 
same harmonic, but scaled. Formally, 

r = k*h =c h . (14) 

nm nm n nm v ' 

It can be readily verified that the harmonics of the same order n but different phase m share the 
same scale factor c n . It is therefore left to determine c n . 

1 5 To determine c n the fact that the half-cosine kernel k is an image obtained when 

the light is a delta function centered in the z direction is used. The transform of the delta 
function is given by 

"ZFZr** (15) 

and the image it produces is 

00 

20 * = £*Ao> < 16 > 

«=0 

where the coefficients k n are given in Equation 8. c„ determines by how much the harmonic is 
scaled following the convolution; therefore, it is the ratio between k n and the respective 
coefficient of the delta function, that is, 
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The first few harmonic reflectances are given by 



r oo ~~ ^00 
r ——h 



r — —h 

r ~ JL.U 
Urn 24 n 4m 



r — JL. h 

f 6m ~~ 64 n 6m 
'8m 128 "8m 



(18) 



5 for -n < m < n (and r 3m = r 5m = r 7m = 0). 

For the construction of the harmonic reflectances it is useful to express the 
harmonics using space coordinates (x,y,z) rather than angles (0, 0). This can be done by 
substituting the following equations for the angles: 

6 - cos -1 z 



^tan-'f 



1 0 The first nine harmonics then become 
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h ^- L - 



hy 



\2n 



XZ 



^22 = 1 (• 



* 2 -y 2 ) 



(20) 



K\ - ^fh x ^21 ~ 

where the superscripts e and o denote the even and the odd components of the harmonics 
respectively {h nm = h e n ^ ± ih°^ , according to the sign of m; in fact the even and odd versions of 

the harmonics are more convenient to use in practice since the reflectance function is real). 
15 Notice that the harmonics are simply polynomials in these space coordinates. As discussed 
below, h nm (6 0) and h nm (x, y 9 z) are invariably used to denote the harmonics expressed in 
angular and space coordinates respectively. 

From Reflectances to Images 

Up to this point the reflectance functions obtained by illuminating a unit albedo 
20 sphere by arbitrary light have been analyzed. The objective of the present invention is to use this 
analysis to efficiently represent the set of images of objects seen under varying illumination. An 
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image of an object under certain illumination conditions can be constructed from the respective 
reflectance function in a simple way: each point of the object inherits its intensity from the point 
on the sphere whose normal is the same. This intensity is further scaled by its albedo. In other 
words, given a reflectance function r (x, y, z), the image of a point p with surface normal n = (n x , 
5 n y , n z ) and albedo X is given by 



How the accuracy of this low dimensional linear approximation to a model's images can be 
affected by the mapping from the reflectance function to images will now be discussed. Two 
points will be made. First, in the worst case, this can make this approximation arbitrarily bad. 
10 Second, in typical cases it will not make this approximation less accurate. 



that there is a rearrangement in the x, y position of points. That is, a particular surface normal 
appears in one location on the unit sphere, and may appear in a completely different location in 
the image. This rearrangement has no effect on this approximation. Images are represented in a 
15 linear subspace in which each coordinate represents the intensity of a pixel. The decision as to 
which pixel to represent with which coordinate is arbitrary, and changing this decision by 
rearranging the mapping from (x,y) to a surface normal merely reorders the coordinates of the 
space. 



20 functions is that occlusion, shape variation and albedo variations effect the extent to which each 
surface normal on the sphere helps determine the image. For example, occlusion ensures that 
half the surface normals on the sphere will be facing away from the camera, and will not produce 
any visible intensities. A discontinuous surface may not contain some surface normals, and a 
surface with planar patches will contain a single normal over an extended region. In between 

25 these extremes, the curvature at a point will determine the extent to which its surface normal 

contributes to the image. Albedo has a similar effect. If a point is black (zero albedo) its surface 
normal has no effect on the image. In terms of energy, darker pixels contribute less to the image 
than brighter pixels. Overall, these effects are captured by noticing that the extent to which the 
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There are two components to turning a reflectance function into an image. One is 



The second and more significant difference between images and reflectance 



-18- 



reflectance of each point on the unit sphere influences the image can range from zero to the 
entire image. 



An example will be given to show that in the worst case this can make this 



approximation arbitrarily bad. First, one should notice that at any single point, a low-order 
5 harmonic approximation to a function can be arbitrarily bad (this can be related to the Gibbs 
phenomena in the Fourier domain). Consider the case of an object that is a sphere of constant 
albedo. If the light is coming from a direction opposite the viewing direction, it will not 
illuminate any visible pixels. The light can be shifted slightly, so that it illuminates just one 
pixel on the boundary of the object; by varying the intensity of the light this pixel can be given 
1 0 any desired intensity. A series of lights can do this for every pixel on the rim of the sphere. If 
there are n such pixels, the set of images gotten fully occupies the positive orthant of an n- 
dimensional space. Obviously, points in this space can be arbitrarily far from any 9D space. 
What is happening is that all the energy in the image is concentrated in those surface normals for 
which the approximation happens to be poor. 

15 However, generally, things will not be so bad. In general, occlusion will render 

an arbitrary half of the normals on the unit sphere invisible. Albedo variations and curvature will 
emphasize some normals, and de-emphasize others. But in general, the normals whose 
reflectances are poorly approximated will not be emphasized more than any other reflectances, 
and the approximation of reflectances on the entire unit sphere is expected to be about as good 

20 over those pixels that produce the intensities visible in the image. 



carry on to the images of objects. Thus the set of images of an object is approximated by a linear 
space spanned by what is called harmonic images, denoted b nm . These are images of the object 
seen under harmonic light. These images are constructed as in Equation 21 as follows: 



Note that boo is an image obtained under constant, ambient light, and so it contains simply the 
surface albedo (up to a scaling factor). The first order harmonic images bj m are images obtained 
under cosine lighting centered at the three main axes. These images contain the three 



Therefore, it is assumed that the subspace results for the reflectance functions 



25 




(22) 



-19- 



components of the surface normals scaled by the albedos. The higher order harmonic images 
contain polynomials of the surface normals scaled by the albedo. Figure 3 shows the first nine 
harmonic images derived from a 3D model of a face. The top row contains the zero'th harmonic 
on the left and two of the first harmonic images. The second row, left, shows the third of the 
5 first harmonic images. The remaining images are images derived from the second harmonics. 



Recognition 

The present invention develops an analytic description of the linear subspace that 
10 lies near the set of images that an object can produce. It is then shown how to use this 

description to recognize objects. Although the method of the present invention is suitable for 
general objects, examples related to the problem of face recognition will be given by way of 
example only and not to limit the scope of the present invention. It is assumed that an image 
must be compared to a data base of models of 3D objects. It is also assumed that the pose of the 
15 object is already known, but that its identity and lighting conditions are not. For example, one 
may wish to identify a face that is known to be facing the camera. Or one may assume that 
either a human or an automatic system have identified features, such as the eyes and the tip of 
the nose, that allow us to determine pose for each face in the data base, but that the data base is 
too big to allow a human to select the best match. 

20 Recognition proceeds by comparing a new image to each model in turn. To 

compare to a model the distance between the image and the nearest image that the model can 
produce is computed. Two classes of algorithms are presented that vary in their representation of 
a model's images. The linear subspace can be used directly for recognition, or one can be 
restricted to a subset of the linear subspace that corresponds to physically realizable lighting 

25 conditions. 

The advantages gained by having an analytic description of the subspace available 
is stressed in the methods of the present invention, in contrast to previous methods in which PCA 
could be used to derive a subspace from a sample of an object's images. One advantage of an 

analytic description is that this provides an accurate representation of an object's images, not 
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subject to the vagaries of a particular sample of images. A second advantage is efficiency; a 
description of this subspace can be produced much more rapidly than PCA would allow. The 
importance of this advantage will depend on the type of recognition problem that is tackled. In 
particular, one is generally interested in recognition problems in which the position of an object 
5 is not known in advance, but can be computed at run-time using feature correspondences. In this 
case, the linear subspace must also be computed at run-time, and the cost of doing this is 
important. How this computation may become part of the inner loop of a model-building 
algorithm is discussed below, where efficiency will also be crucial. Finally, it will be shown that 
when a 4D linear subspace is used, the constraint that the lighting be physically realizable can be 
10 incorporated in an especially simple and efficient way. 

Linear methods 

The most straightforward way to use the prior results for recognition is to 
compare a novel image to the linear subspace of images that correspond to a model. To do this, 
the harmonic basis images of each model are produced. Given an image / a vector a is sought 

1 5 that minimizes \\Ba - 7||, where B denotes the basis images, B isp x r, p is the number of points in 
the image, and r is the number of basis images used. As discussed above, nine is a natural value 
to use for r, but r = 4 provides greater efficiency while r = 1 8 offers even better potential 
accuracy. Every column of B contains one harmonic image b nm . These images form a basis for 
the linear subspace, though not an orthonormal one. A QR decomposition is applied to B to 

20 obtain such a basis, Q. The distance from the image, /, and the space spanned by B as \\QQ T I - /|| 
can then be computed. The cost of the QR decomposition is 0(pr 2 ), assuming p »r. 

In contrast to this, prior methods have sometimes performed PCA on a sample of 
images to find a linear subspace representing an object. For example, Georghides renders the 
images of an object and finds an 1 ID subspace that approximates these images. When s sampled 
25 images are used (typically s » r), with s«p PCA requires 0(ps 2 ). Also, in MATLAB, PCA of 
a thin, rectangular matrix seems to take exactly twice as long as its QR decomposition. 
Therefore, in practice, PCA on the matrix constructed by the methods of Georghides would take 
about 150 times as long as using the method of the present invention to build a 9D linear 
approximation to a model's images. This may not be too significant if pose is known ahead of 
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time and this computation takes place offline. But when pose is computed at ran time, the 
advantages of the methods of the present invention can become very great. 

It is also interesting to compare the methods of the present invention to another 
linear method, due to A. Shashua. "On Photometric Issues in 3D Visual Recognition from a 
5 Single 2D Image", Int. J. of Comp. Vis., 21(1-2): 99-122, 1997. (hereinafter "Shashua"). 

Shashua points out that in the absence of attached shadows, every possible image of an object is 
a linear combination of the x, y and z components of the surface normals, scaled by the albedo. 
Shashua therefore proposes using these three components to produce a 3D linear subspace to 
represent a model's images. Notice that these three vectors are identical, up to a scale factor, to 
1 0 the basis images produced by the first harmonic in the methods of the present invention. 

While this equivalence is clear algebraically, it can also be explained as follows. 
The first order harmonic images are images of any object subjected to a lighting condition 
described by a single harmonic. The Funk-Hecke theorem ensures that all components of the 
kernel describing the reflectance function will be irrelevant to this image except for the first 
1 5 order components. In Shashua's work, the basis images are generated by using a point source as 
the lighting function, which contains all harmonics. But the kernel used is a full cosine function 
of the angle between the light and the surface normal. This kernel has components only in the 
first harmonic. So all other components of the lighting are irrelevant to the image. In either 
case, the basis images are due only to the first set of harmonics. 

20 Enforcing Positive Light 

When one takes arbitrary linear combinations of the harmonic basis images, one 
may obtain images that are not physically realizable. This is because the corresponding linear 
combination of the harmonics representing lighting may contain negative values. That is, 
rendering these images may require negative "light", which of course is physically impossible. It 
25 is now shown how to use the basis images while enforcing the constraint of non-negative light. 
There are those in the art who have shown that the set of images of an object produced by non- 
negative lighting is a convex cone in the space of all possible images. As discussed above, this is 
referred to as the illumination cone. It is also shown how to compute approximations to this 

cone in the space spanned by the harmonic basis images. 
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Specifically, given an image lit is attempted to minimize \\Ba - 7|| subject to the 
constraint that the light is non-negative everywhere along the sphere. A straightforward method 
to enforce positive light is to infer the light from the images by inverting the convolution. This 
would yield linear constraints in the components of a, Ha > 0, where the columns ofH contain 
5 the spherical harmonics h nm . Unfortunately, this naive method is problematic since the light may 
contain higher order terms which cannot be recovered from a low order approximation of the 
images of the object. In addition, the harmonic approximation of non-negative light may at 
times include negative values. Forcing these values to be non-negative will lead to an incorrect 
recovery of the light. As discussed below, a different method is discussed in which the 
10 illumination cone is projected onto the low dimensional space and use this projection to enforce 
non-negative lighting. 

A method that can use any number of harmonic basis images is first presented. A 
non-negative lighting function can be written as a non-negative combination of delta functions, 
each representing a point source. Denote by 8q 0 the function returning 1 at (#, 0) and 0 
15 elsewhere. This lighting function represents a point source at direction (#, 0). To project the 
delta function onto the first few harmonics one needs to look at the harmonic transform of the 
delta function. Since the inner product of d$ 0 with a function/ returns simply / (6, 0), one can 
conclude that the harmonic transform of the delta function is given by 

co n 

I KlO^)K m - (23) 

n=0 77i=— n 

20 The projection of the delta function onto the first few harmonics, therefore, is obtained by taking 
the sum only over the first few terms. 

Suppose now that a non-negative lighting function { 0) is expressed as a non- 
negative combination of delta functions 

7=1 
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for some s. Obviously, due to the linearity of the harmonic transform, the transform of His a 
non-negative combination of the transforms of the delta functions with the same coefficients. 
That is, 

* = t a jl t h M^X„r (25) 

7=1 «=0 m=~n 

5 Likewise, the image of an object illuminated by £ can be expressed as a non-negative 
combination as follows 

/= 2>,Z 5>-(mX. (26) 

y=l »=0 m--n 

where b nm = k n h nm (see previous section). 

Given an image, an objective of the present invention is to recover the non- 
1 0 negative coefficients aj. Assume an approximation of order N 9 and denote the number of 
harmonics required for spanning the space by r - r(N) (e.g., if N = 2 then r = 9). In matrix 
notation, the harmonic functions are denoted by H, H is s x r, where s is the number of sample 
points on the sphere. The columns of H contain a sampling of the harmonic functions, while its 
rows contain the transform of the delta functions. Further, B is denoted by the basis images, B is 
15 p x r, where p is the number of points in the image. Every column of B contains one harmonic 
image b nm . Finally, denote a T = (# . ., a s ). Then, the objective is to solve the non-negative least 
squares problem: 

min|LB# 7 a - 1\\ sJ.. a > 0. (27) 

One can further project the image to the r-dimensional space spanned by the harmonic images 
20 and solve the optimization problem in this smaller space. To do so a QR decomposition is 
applied to J?, so that B = QR, where Q is unitary and R is upper triangular. Keeping only r 
columns for Q y and multiplying the optimization function by Q T from the left 

minima - Q T l\\ sJ. a > 0. (28) 

Now R is r x r and Q T I is a r- vector. 
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Note that this method is similar to that presented in Georghides et al. The primary 



difference is that a low dimensional space constructed for each model using its harmonic basis 
images is worked in. Georghides et al. performs a similar computation after projecting all 
images into a 100-dimensional space constructed using PCA on images rendered from models in 
5 a ten-model data base. The methods of the present invention are inspired by the work of those in 
the art, but it is felt that it improves upon it by working in a space that can be analytically and 
efficiently constructed. Moreover, it is known that this space provides an accurate representation 
of a model's images. 

Recognition with Four Harmonics 

1 0 A further simplification can be obtained if the set of images of an object is 

approximated only up to first order. Four harmonics are required in this case. One is the DC 
component, representing the appearance of the object under uniform ambient light, and three are 
the basis images also used by Shashua. Again, ||j?a - 7|| (now B is p x 4) is attempted to be 
minimized subject to the constraint that the light is non-negative everywhere along the sphere. 

15 As before, the constraints are determined by projecting the delta functions onto 

the space spanned by the first four harmonics. However, now this projection takes a particularly 
simple form. Consider a delta function de 0 . Its first order approximation is given by 



n 
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n-Q m--n 



Using space coordinates this approximation becomes 



20 




1 (xsin#cos^ + j sin 0 sin (p + z cos 

47T An 



(30) 



Let 



(31) 



be the first order approximation of a non-negative lighting function {. f is a non-negative 
combination of delta functions. It can be readily verified that such a combination cannot 
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decrease the zero order coefficient relative to the first order ones. Consequently, any non- 
negative combination of delta functions must satisfy 



9af >aj_af + af. (32) 



(Equality is obtained when the light is a delta function, see Equation 30.) Therefore, one can 
5 express the problem of recognizing an object with a 4D harmonic space as minimizing \\Ba-I\\ 
subject to Equation 32. 

In the four harmonic case the harmonic images are just the albedos and the 
components of the surface normals scaled by the albedos, each scaled by some factor. It is 
therefore natural to use those directly and hide the scaling coefficients within the constraints. Let 
10 I be an image of the object illuminated by {, then, using Equations 1 8 and 22, 

I nm 0 A_^{a } An x +a 2 An y +a 3 An z ). (33) 

where X and (n X9 n y , n x ) are respectively the albedo and the surface normal of an object point. 
Using the unsealed basis images, X, Xn x , Xn y , and Xn z , this equation can be written as: 

/ ^b 0 A + b l An x +b 2 An y + b^Jin z , (34) 
1 5 with b 0 = 7ia 0 and b\ = a . (l < i < 3) Substituting for the afs one obtains 

' >-^(%+bl+%\ (35) 



n 1 An 
which simplifies to 

4b 2 0 >b*+b 2 2 +b>. (36) 

Consequently, to solve the 4D case the difference between the two sides of Equation 34 are 
20 minimized subject to Equation 36. 

It is now shown that finding the nearest image in the space spanned by the first 
four harmonic images with non-negative light can be transformed to a six degree polynomial 
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with a single variable, the Lagrange multiplier. With this polynomial solving the minimization 
problem becomes straightforward. 

Finding the nearest image in the 4D harmonic space subject to the constraint that 
the light is non-negative has the general form 

min||^x - Z>|| s.t. x T Bx = 0, 

with A{nxA\b{nx\) resides in the column space of A and B (4 x 4). In this representation the 
columns of A contain the harmonic images, b is the image to be recognized, and B = diag (4, - 1 , - 
1,-1). However, those skilled in the art will recognize that the methods of the present invention 
can also be used with an arbitrary nonsingular matrix B. 

First, one can solve the linear system min 

min|^4x - b\ 

and check if this solution satisfies the constraint. If it does, one is done. If not, one must seek 
the minimum that occurs when the constraint is satisfied at equality. The solution is divided into 
two parts. In the first part the problem is converted to the form: 

min||z - c\\ s.t z T Dz > 0, 

As discussed below, the new problem can be turned into a sixth order polynomial. 
Step 1: 

Define b' such that Ab* = b (this is possible because b lies in the column space of 
A). Then, Ax - b = A(x - b f ), implying that this problem is equivalent to: 

min||4* - b r )\s.L x T Bx = 0. 

Using the method presented in Golub and van Loan (see the second edition, pages 466-471, 
especially algorithm 8.7. 1) A T A and B are simultaneously diagnolized. This will produce a non- 
singular matrix X such that X T A T AX = / and X T BX =D,I denotes the identity matrix, and D is a 
4x4 diagonal matrix. Thus, one obtains 
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min|jT ' (x - b')\ s.t.x T x T X- T DX~ l x = 0. 

where X denotes the inverse of X and X denotes its transpose. Denote z = X x and c = X 
then one obtains min 

mm||z~c|| sJ. z T Dz = 0. 

This has the requested form. 
Step 2 : 

At this point the present invention attempts to solve a problem of the form 

min||z — c|j s.L z T Dz = 0. 
This minimization problem is solved using Lagrange multipliers. That is, 

min||z - c|| + Az 7 Dz = 0. 
Taking the derivatives with respect to x and X one gets 

z — c + ADz — 0, 

and 

z T Dz = 0. 

From the first equation one gets 

z = {i+my i c. 

Since Z) is diagonal the components of z are given by 

' l + M r 

The constraint z T Dz = 0 thus becomes 
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f c ' d ' =o 
h (i+m,) 2 ' 

which, after multiplying out the denominator, becomes a sixth degree polynomial in X. This 
polynomial can be efficiently and accurately solved using standard techniques (the MATLAB 
function roots is used). All solutions are plugged in to determine x, as indicated above, and 
5 choose the real solution that minimizes the optimization criteria. 

Experiments 

Experiments have been performed with the recognition methods of the present 
invention, using a subset of a data base of faces. This subset contains 3D models of ten faces, 
including models of their albedos in the red, green and blue color channels. As test images, 42 

10 images of a single individual are used, taken across seven different poses and six different 

lighting conditions (shown in Figure 4). In these experiments, each image is compared to each 
model, and the rank of the correct answer is determined (i.e., a rank of one means the correct 
answer was chosen first). This subset of the data base is too small to allow us to draw any 
definite conclusions from the experiments. Rather, it is small enough to allow us to compare a 

15 number of different methods, some of which are too slow to run on a large data set. 

In implementing all the methods, one must first obtain a 3D alignment between 
the model and the image (referred to as "positioning"). This can be done with existing methods 
known in the art, such as with the method disclosed in co-pending U.S. Patent Application, 
Serial Number (Attorney Docket Number NECI 1443, 1 3414), which is 

20 incorporated herein by its reference. In brief, features on the faces can be identified by hand, and 
then a 3D rigid transformation can be found to align the 3D features with the corresponding 2D 
image features. For instance, suppose the 3D models are models of many people's faces. Prior 
to recognition, a person can click on points on the face, indicating the location of features like 
the center of the eyes or the tip of the nose. When the input image arrives, a person can click on 

25 corresponding features in the input image. Given a match between the image features and the 
model features, for each model it can be determined the position of that object relative to the 
camera that best matches the model features to the input image features. Determining this 
positioning is a well-studied problem in the art, for which many solutions have been derived. 
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To determine lighting conditions, attention is only paid to image pixels that have 
been matched to some point in the 3D model of the face. Image pixels that are of maximum 
intensity are also ignored, since these may be saturated, and provide misleading values. Finally, 
both the model and the image are subsampled, replacing each m x m square with its average 
5 values. This is because some of the methods described below, especially Georghides', are too 
slow to run on full images. A slight variation of Georghides', method is implemented in order to 
avoid approximations. Each model is rendered using 100 different point sources. Then these 
100 images are projected, along with the test image, into a 101 D space, where non-negative least 
squares optimization is performed. This is equivalent to, but more efficient than performing the 

10 optimization in the full space. However, it still requires use of SVD on 101 images, which is too 
slow when performed on the entire images. In their experiments this Singular Value 
Decomposition (SVD) was performed off-line, but in these experiments, since pose is not known 
ahead of time, SVD must be performed for each pose, on-line. SVD is a standard method of 
decomposing a matrix that makes explicit its most important components. However, preliminary 

15 experiments indicate that the methods of the present invention subsample quite a bit without 
significantly reducing accuracy. In the experiments below, all algorithms subsampling are run 
with 16x16 squares. Some other algorithms have been run with less subsampling also. 

Once lighting conditions were determined on the subsampled image, they were 
used to render the model in an image of full size. So, for example, the methods of the present 

20 invention produce coefficients that tell us how to linearly combine the harmonic images to 

produce the rendered image. These coefficients were computed on the sampled image, but then 
applied to harmonic images of the full, unsampled image. This process was repeated separately 
for each color channel. Then, a model was compared to the image by taking the root mean 
squared error, derived from the distance between the rendered face model and the portion of the 

25 image that it intersected. 



TABLE 2 



Algorithm 


Percent correct 


Average rank 


Georghides et al. 


76 


1.7 


Non-negative light 4 


57 


2.4 
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Linear 9 


86 


1.2 


Non-negative light 9 


79 


1.2 


Linear 9 (sampling 8) 


88 


1.1 


Linear 9 (sampling 4) 


88 


1.1 



The results of these experiments are shown in Table 2. In Table 2, "Georghides" 
indicates results finding lighting by matching the image to the non-negative combination of 
images generated with point sources. <4 Non-Negative light 4 and 9" indicate the methods of the 
5 present invention, using four and nine-dimensional harmonic basis images, along with a 
constraint that the lighting be positive. "Linear 9" indicates the linear method of the present 
invention using a 9D space. Table 2 also shows the results of applying "Linear 9" with smaller 
subsampling. "Percent correct" indicates what fraction of the images were matched to the right 
answer. Table 2 also shows an "Average rank", where a rank of k indicates that the right answer 
1 0 was picked k'th (i.e., rank of one means the correct answer was chosen). One can see that the 
methods of the present invention that use a 9D harmonic basis are more accurate than existing 
methods. The 4D harmonic may be less accurate, but it is much more efficient than the other 
methods. 

While there has been shown and described what is considered to be preferred 
1 5 embodiments of the invention, it will, of course, be understood that various modifications and 
changes in form or detail could readily be made without departing from the spirit of the 
invention. It is therefore intended that the invention be not limited to the exact forms described 
and illustrated, but should be constructed to cover all modifications that may fall within the 
scope of the appended claims. 

20 
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WHAT IS CLAIMED IS: 

1 LA method for choosing an image from a plurality of three-dimensional models 

2 which is most similar to an input image; the method comprising the steps of: 

3 (a) providing a database of the plurality of three-dimensional models; 

4 (b) providing an input image; 

5 (c) positioning each three-dimensional model relative to the input image; 

6 (d) for each three-dimensional model, determining a rendered image that is most 

7 similar to the input image by: 

8 (i) computing a linear subspace that describes an approximation to the set 

9 of all possible rendered images that each three-dimensional model can produce under all possible 

1 0 lighting conditions where each point in the linear subspace represents a possible image; and 

1 1 (ii) finding the point on the linear subspace that is closest to the input 

12 image; 

1 3 (e) computing a measure of similarly between the input image and each rendered 

14 image; and 

15 (f) selecting the three-dimensional model corresponding to the rendered image 

16 whose measure of similarity is most similar to the input image. 

1 2. The method of claim 1, wherein step (a) comprises building each three- 

2 dimensional model from a series of images taken under predetermined lighting conditions. 

1 3. The method of claim 1 , wherein step (a) comprises assigning a location to each 

2 point on the surface of each three-dimensional model and at least one corresponding identifier 

3 which identifies the fraction of light that is reflected at each point. 

1 4. The method of claim 3, wherein the at least one corresponding identifier 

2 comprises three albedos, one for each of how much red, blue, and green light is reflected. 
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1 5. The method of claim 1 5 wherein step (b) comprises providing a two- 

2 dimensional input image. 

1 6. The method of claim 1, wherein step (c) comprises aligning predetermined 

2 points on the three-dimensional model and the input image. 

1 7. The method of claim 1 , wherein step (d) is repeated for each of a red, green, 

2 and blue color component for each three-dimensional model. 

1 8. The method of claim 3, wherein step (d)(i) comprises computing polynomials 

2 from descriptions of the direction of the surface normal at each point and from the at least one 

3 corresponding identifier. 

1 9. The method of claim 1, wherein the linear subspace is four-dimensional. 

1 1 0. The method of claim 1 , wherein the linear subspace is nine-dimensional. 

1 11. The method of claim 1 , wherein step (e) comprises determining the 

2 magnitude of the difference between the input image and each rendered image. 

1 12. The method of claim 1, wherein step (d)(ii) comprises computing the point in 

2 the linear subspace that is closest to the input image using a linear projection. 

1 1 3. A method for choosing an image from a plurality of three-dimensional 

2 models which is most similar to an input image; the method comprising the steps of: 

3 (a) providing a database of the plurality of three-dimensional models; 

4 (b) providing an input image; 
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5 (c) positioning each three-dimensional model relative to the input image; 

6 (d) for each three-dimensional model, determining a rendered image that is most 

7 similar to the input image by: 

8 (i) computing a linear subspace that describes an approximation to the set 

9 of all possible rendered images that each three-dimensional model can produce under all possible 

10 lighting conditions where each point in the linear subspace represents a possible image; and 

1 1 (ii) finding a rendered image in a subset of the linear subspace obtained by 

12 projecting the set of images that are generated by positive lights onto the linear subspace; 

13 (e) computing a measure of similarly between the input image and each rendered 

14 image; and 

1 5 (f) selecting the three-dimensional model corresponding to the rendered image 

1 6 which is most similar to the input image. 

1 14. The method of claim 13, wherein step (a) comprises building each three- 

2 dimensional model from a series of images taken under predetermined lighting conditions. 

1 15. The method of claim 1 3, wherein step (a) comprises assigning a location to 

2 each point on the surface of each three-dimensional model and at least one corresponding 

3 identifier which identifies the fraction of light that is reflected at each point. 

1 16. The method of claim 15, wherein the at least one corresponding identifier 

2 comprises three albedos, one for each of how much red, blue, and green light is reflected. 

1 17. The method of claim 13, wherein step (b) comprises providing a two- 

2 dimensional input image. 

1 18. The method of claim 13, wherein step (c) comprises aligning predetermined 

2 points on the three-dimensional models and the input image. 
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1 19. The method of claim 13, wherein step (d) is repeated for each of a red, green, 

2 and blue color component for each three-dimensional model. 

1 20. The method of claim 15, wherein step (d)(i) comprises computing 

2 polynomials from descriptions of the direction of the surface normal at each point and from the 

3 at least one corresponding identifier. 

1 21. The method of claim 13, wherein the linear subspace is four-dimensional. 

1 22. The method of claim 13, wherein the linear subspace is nine-dimensional. 

1 23. The method of claim 13, wherein step (e) comprises determining the 

2 magnitude of the difference between the input image and each rendered image. 

1 24. The method of claim 2 1 , wherein step (d)(ii) comprises rendering an image of 

2 each three-dimensional model using non-negative lighting by solving a sixth order polynomial. 

1 25. The method of claim 22, wherein step (d)(ii) comprises finding the rendered 

2 image that is a convex combination of images generated with light coming from a single 

3 direction, projected onto the nine-dimensional space. 

1 26. The method of claim 25, wherein the rendered image is found using a non- 

2 negative least squares algorithm. 
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LAMBERTIAN REFLECTANCE AND LINEAR SUBSPACES 
ABSTRACT OF THE DISCLOSURE 

A method for choosing an image from a plurality of three-dimensional 
5 models which is most similar to an input image is provided. The method includes the 
steps of: (a) providing a database of the plurality of three-dimensional models; (b) 
providing an input image; (c) positioning each three-dimensional model relative to the 
input image; (d) for each three-dimensional model, determining a rendered image that 
is most similar to the input image by: (d)(i) computing a linear subspace that describes 

1 0 an approximation to the set of all possible rendered images that each three- 
dimensional model can produce under all possible lighting conditions where each 
point in the linear subspace represents a possible image; and one of(d)(ii) finding the 
point on the linear subspace that is closest to the input image or finding a rendered 
image in a subset of the linear subspace obtained by projecting the set of images that 

15 are generated by positive lights onto the linear subspace; (e) computing a measure of 
similarly between the input image and each rendered image; and (f) selecting the 
three-dimensional model corresponding to the rendered image whose measure of 
similarity is most similar to the input image. Step (d) is preferably repeated for each 
of a red, green, and blue color component for each three-dimensional model. The 

20 linear subspace is preferably either four-dimensional or nine-dimensional. 
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